Analysis of large-scale distributed machine learning systems: a case study on LDA
TANG Lizhe, FENG Dawei, LI Dongsheng, LI Rongchun, LIU Feng
Journal of Computer Applications    2017, 37 (3): 628-634.   DOI: 10.11772/j.issn.1001-9081.2017.03.628
Abstract
To address the problems of scalability, algorithm convergence and operational efficiency in building large-scale machine learning systems, the challenges that large-scale samples, large models and network communication pose to such systems were analyzed, and the solutions adopted by existing systems were presented. Taking the Latent Dirichlet Allocation (LDA) model as an example, three open-source distributed LDA systems, Spark LDA, PLDA+ and LightLDA, were compared, and their differences in system design, implementation and performance were analyzed in terms of system resource consumption, algorithm convergence and scalability. The experimental results show that for small sample sets and models, the memory usage of LightLDA and PLDA+ is about half that of Spark LDA, and their convergence speed is 4 to 5 times that of Spark LDA. For large-scale sample sets and models, the network communication volume and convergence time of LightLDA are much smaller than those of PLDA+ and Spark LDA, showing good scalability. The "data parallelism + model parallelism" approach can effectively meet the challenges of large-scale samples and models, while the Stale Synchronous Parallel (SSP) consistency model for parameters, local model caching and sparse parameter storage can effectively reduce network cost and improve system operation efficiency.
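The last point can be illustrated with a minimal sketch of the SSP idea: each worker caches model parameters locally and only refetches a value from the parameter server when the cached copy is older than a fixed staleness bound, which is what reduces network traffic compared with fully synchronous updates. The sketch below is a single-process simulation under assumed names (SSPParameterCache, InMemoryServer, pull, push); it is not taken from Spark LDA, PLDA+ or LightLDA.

# Minimal sketch of Stale Synchronous Parallel (SSP) parameter access.
# All class and method names here are hypothetical illustrations.
from collections import defaultdict


class InMemoryServer:
    """Stand-in for a distributed parameter server."""

    def __init__(self):
        self.table = defaultdict(float)

    def pull(self, key):
        return self.table[key]

    def push(self, key, delta):
        self.table[key] += delta


class SSPParameterCache:
    """Worker-side parameter cache with bounded staleness."""

    def __init__(self, server, staleness_bound):
        self.server = server              # remote parameter store (stub)
        self.staleness = staleness_bound  # maximum allowed clock gap
        self.cache = {}                   # locally cached parameter values
        self.cached_at_clock = {}         # clock at which each value was fetched
        self.clock = 0                    # this worker's iteration counter

    def get(self, key):
        # Serve from the local cache while it is fresh enough;
        # otherwise refetch from the server.
        fetched = self.cached_at_clock.get(key, -1)
        if key not in self.cache or self.clock - fetched > self.staleness:
            self.cache[key] = self.server.pull(key)
            self.cached_at_clock[key] = self.clock
        return self.cache[key]

    def update(self, key, delta):
        # Apply the update locally and push the delta to the server.
        self.cache[key] = self.cache.get(key, 0.0) + delta
        self.server.push(key, delta)

    def advance_clock(self):
        # Called once per local iteration (mini-batch / sampling sweep).
        self.clock += 1

With a staleness bound of 0 this degenerates to bulk-synchronous behavior, while a larger bound lets workers proceed on slightly stale parameters and amortizes communication over several iterations, the trade-off the compared systems exploit to different degrees.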